Combined Spectral Subtraction and Cepstral Normalisation for Robust Speech Recognition

Authors

Haitian Xu

Zheng-Hua Tan

Paul Dalsgaard

Børge Lindberg

Abstract

This paper presents an effective feature processing algorithm for robust speech recognition, based on combined spectral and cepstral processing. The spectral processing consists of Full-Wave Rectification Spectral Subtraction (FWR-SS) and Likelihood Controlled Instantaneous Noise Estimation (LCINE), while the cepstral processing is based on mean and variance normalisation. The combination is motivated by the fact that spectral subtraction, which usually operates on a single frame, introduces large statistical mismatches between clean and enhanced noisy speech in the cepstral domain, resulting in a degradation of the recognition performance. The introduced cepstral processing is able, to some extent, to mitigate these mismatches, and in this sense the two methods are not just combined but shown to be complementary. Statistical analyses as well as recognition experiments are conducted on the Aurora 2 database, and a performance comparable to that of the much more complex ETSI advanced front-end is achieved.
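The two processing stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify LCINE, so the noise estimate here is simply the mean of the first few frames (an assumption), the mel filterbank is omitted, and all function names are hypothetical. Full-wave rectification is taken in its usual sense of replacing a negative subtraction result with its absolute value.

```python
import numpy as np

def fwr_spectral_subtraction(noisy_mag, noise_mag):
    """Subtract a noise magnitude estimate from each frame.

    Negative results are full-wave rectified, i.e. replaced by their
    absolute value (half-wave rectification would set them to zero).
    """
    return np.abs(noisy_mag - noise_mag)

def simple_cepstra(mag, floor=1e-8):
    """Log spectrum followed by an unnormalised DCT-II (no mel filterbank)."""
    n = mag.shape[1]
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return np.log(mag + floor) @ basis.T

def cmvn(cepstra, eps=1e-8):
    """Per-utterance cepstral mean and variance normalisation."""
    mean = cepstra.mean(axis=0, keepdims=True)
    std = cepstra.std(axis=0, keepdims=True)
    return (cepstra - mean) / (std + eps)

# Synthetic magnitude spectra: 50 frames, 8 frequency bins.
rng = np.random.default_rng(0)
noisy_mag = np.abs(rng.normal(size=(50, 8))) + 0.5

# Assumption: the first 5 frames contain noise only (stand-in for LCINE).
noise_mag = noisy_mag[:5].mean(axis=0)

enhanced = fwr_spectral_subtraction(noisy_mag, noise_mag)
features = cmvn(simple_cepstra(enhanced))
```

After normalisation each cepstral coefficient has approximately zero mean and unit variance over the utterance, which is the mechanism the abstract credits with reducing the statistical mismatch introduced by the subtraction step.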


Similar resources

Improving the performance of MFCC for Persian robust speech recognition

The Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...


Forward masking on a generalized logarithmic scale for robust speech recognition

This paper examines forward masking on a generalized logarithmic scale for speech recognition that is robust to both additive and convolutional noise. Forward masking in the dynamic cepstral (DyC) representation is based upon subtraction of a masking pattern from the current spectrum in the logarithmic spectral domain, whereas the proposed method intends to make a compromise between the logarithm...


Spectral Normalisation MFCC Derived Features for Robust Speech Recognition

This paper presents a method for extracting MFCC parameters from a normalised power spectral density. The underlying spectral normalisation method is based on the fact that speech regions with less energy need more robustness, since noise is more dominant in these regions and the speech is therefore more corrupted. Lower-energy speech regions usually contain sounds of unvoiced nature where are i...


A New Data Driven Method for Robust Speech Recognition

The conventional view of the robustness problem in speech recognition is that performance degradation in ASR systems is due to a mismatch between training and test conditions. If the problem of robustness in ASR systems is considered as a mismatch between the training and testing conditions, the solution is to find a way to reduce it. Common approaches are: Data-Driven methods such as speec...


Distant-Talking Speech Recognition Based on Spectral Subtraction by Multi-Channel LMS Algorithm

We propose a blind dereverberation method based on spectral subtraction using a multi-channel least mean squares (MCLMS) algorithm for distant-talking speech recognition. In a distant-talking environment, the channel impulse response is longer than the short-term spectral analysis window. By treating the late reverberation as additive noise, a noise reduction technique based on spectral subtrac...

Publication date: 2005
